GROWTH RATE OF BINARY WORDS AVOIDING $xxx^R$


JAMES CURRIE AND NARAD RAMPERSAD 


Abstract. Consider the set of those binary words with no non-empty factors of the form
$xxx^R$. Du, Mousavi, Schaeffer, and Shallit asked whether this set of words grows
polynomially or exponentially with length. In this paper, we demonstrate the existence of upper
and lower bounds on the number of such words of length $n$, where each of these bounds
is asymptotically equivalent to a (different) function of the form $Cn^{\lg n + c}$, where $C$, $c$ are
constants and $\lg n$ denotes the base-2 logarithm of $n$.

1. Introduction 

In this paper we study the binary words avoiding the pattern $xxx^R$. Here the notation
$x^R$ denotes the “reversal” or “mirror image” of $x$. For example, the word 011011110 is an
instance of $xxx^R$, with $x = 011$. The avoidability of patterns with reversals has been studied
before, for instance by Rampersad and Shallit [10] and by Bischoff, Currie, and Nowotka.

The question of whether a given pattern with reversal is avoidable may initially seem
somewhat trivial. For instance, the pattern $xx^R$ is avoided by the periodic word $(012)^\omega$ and
$xxx^R$, the pattern studied in this paper, is avoided by the periodic word $(01)^\omega$. However,
looking at the entire class of binary words that avoid $xxx^R$ reveals that these words have a
remarkable structure.
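The definition of an instance of $xxx^R$ can be made concrete with a short brute-force test. The following is a minimal sketch, not part of the paper; the function name `contains_xxxR` is our own choice for illustration.

```python
def contains_xxxR(w: str) -> bool:
    """Return True if w has a non-empty factor of the form x x x^R."""
    n = len(w)
    for k in range(1, n // 3 + 1):        # k = |x|
        for i in range(n - 3 * k + 1):    # i = start of the candidate factor
            x = w[i:i + k]
            # second block must repeat x, third block must be x reversed
            if w[i + k:i + 2 * k] == x and w[i + 2 * k:i + 3 * k] == x[::-1]:
                return True
    return False


# The example from the text: 011011110 = x x x^R with x = 011.
example_is_instance = contains_xxxR("011011110")
# A finite prefix of the periodic word (01)^omega.
periodic_avoids = not contains_xxxR("01" * 20)
```

The search is cubic in the worst case, which is adequate for the short words considered here.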

Du, Mousavi, Schaeffer, and Shallit [7] looked at binary words avoiding $xxx^R$. They
noted that there are various periodic words that avoid this pattern and also proved that a
certain aperiodic word studied by Rote [12] and related to the Fibonacci word also avoids
the pattern $xxx^R$. They posed a variety of conjectures and open problems concerning binary
words avoiding $xxx^R$, notably: Does the number of such words of length $n$ grow polynomially
or exponentially with $n$?

The growth rate of words avoiding a given pattern over a certain alphabet is a fundamental
problem in combinatorics on words (see the survey by Shur). Typically, for families of
words defined in terms of the avoidability of a pattern, this growth is either polynomial or
exponential. For instance, there are exponentially many ternary words of length $n$ that avoid
the pattern $xx$ and exponentially many binary words of length $n$ that avoid the pattern $xxx$
[1]. Similarly, there are exponentially many words over a 4-letter alphabet that avoid the
pattern $xx$ in the abelian sense [5]. Indeed, the vast majority of avoidable patterns lead
to exponential growth. Polynomial growth is rather rare: The two known examples are
binary words avoiding overlaps and words over a 4-letter alphabet avoiding the pattern
abwbcxaybazac. It was therefore quite natural for Du et al. to suppose that the growth
of binary words avoiding $xxx^R$ was either polynomial or exponential. However, we will
show that in this case the growth is intermediate between these two possibilities. To our
knowledge, this is the first time such a growth rate has been shown in the context of pattern
avoidance.

Our main result is a “structure theorem” analogous to the well-known result of Restivo
and Salemi concerning binary overlap-free words. The existence of such a structure
theorem was conjectured by Shallit (personal communication) but he could not precisely
formulate it. The result of Restivo and Salemi implies the polynomial growth of binary
overlap-free words. In our case, the structure theorem we obtain leads to an upper bound
of the form $Cn^{\lg n + c}$ for binary words avoiding $xxx^R$ (here $\lg n$ denotes the base-2 logarithm
of $n$). We also are able to establish a lower bound of the same type. In Table 1 we give an
exact enumeration for small values of $n$.

Table 1. Number $a_n$ of binary words of length $n$ avoiding $xxx^R$.




The sequence $(a_n)_{n \ge 1}$ is sequence A241903 of the On-Line Encyclopedia of Integer
Sequences.
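For small $n$ the numbers $a_n$ can be recomputed by exhaustive search. The sketch below is an illustration only: it enumerates all $2^n$ binary words, so it is practical just for small lengths, and the helper names are ours rather than the paper's.

```python
from itertools import product


def contains_xxxR(w: str) -> bool:
    """Return True if w has a non-empty factor of the form x x x^R."""
    n = len(w)
    for k in range(1, n // 3 + 1):
        for i in range(n - 3 * k + 1):
            x = w[i:i + k]
            if w[i + k:i + 2 * k] == x and w[i + 2 * k:i + 3 * k] == x[::-1]:
                return True
    return False


def count_avoiding(n: int) -> int:
    """Number of binary words of length n with no non-empty factor x x x^R."""
    return sum(
        1
        for letters in product("01", repeat=n)
        if not contains_xxxR("".join(letters))
    )


# First few terms, computed by brute force over all 2^n words.
first_terms = [count_avoiding(n) for n in range(1, 7)]
```

For lengths below 6 the only possible instances are 000 and 111, so the count follows a Fibonacci-type recurrence there; at length 6 the instances 010110 and 101001 are excluded as well.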

2. Blocks L and S 

Define

$$\mathcal{K} = \{z \in 0\{0,1\}^*1 : z \text{ avoids } xxx^R\}.$$

Let the transduction $h : \{L, S\}^* \to \{0,1\}^*$ be defined for a sequence $u = \prod_{i=0}^{n} u_i$,
$u_i \in \{L, S\}$, by $h(u) = \prod_{i=0}^{n} h_i(u_i)$, where

$$h_i(u_i) = \begin{cases}
00100 & u_i = S \text{ and } i \text{ even} \\
11011 & u_i = S \text{ and } i \text{ odd} \\
00100100 & u_i = L \text{ and } i \text{ even} \\
11011011 & u_i = L \text{ and } i \text{ odd.}
\end{cases}$$

Then define

$$\mathcal{M} = \{u \in \{S, L\}^* : h(u) \text{ avoids } xxx^R\}.$$
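The transduction $h$ is easy to implement directly from its case table. The following minimal sketch (the names `BLOCKS`, `complement` and `h` are ours) writes the block of a letter as is at even indices and complemented at odd indices.

```python
# Blocks written at even indices; odd indices use the bitwise complement.
BLOCKS = {"S": "00100", "L": "00100100"}


def complement(w: str) -> str:
    """Swap 0 and 1 in a binary word."""
    return w.translate(str.maketrans("01", "10"))


def h(u: str) -> str:
    """Apply the index-sensitive transduction h to a word over {S, L}."""
    pieces = []
    for i, letter in enumerate(u):
        block = BLOCKS[letter]
        pieces.append(block if i % 2 == 0 else complement(block))
    return "".join(pieces)


image_SL = h("SL")   # S at index 0, L at index 1 (complemented)
image_LS = h("LS")   # L at index 0, S at index 1 (complemented)
```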

Theorem 1. Let $z \in \mathcal{K}$. Then there exists a constant $C$ such that $z$ can be written

$$z = p\,h(u)\,s\,t$$

where $|p|, |s| \le C$, $u \in \mathcal{M}$, and $t \in (\epsilon + 1)(01)^*(\epsilon + 1)$.

Proof. Word $z$ cannot contain 000 or 111 as a factor, so write $z = f(v)$ where $v \in \{ab, ad, cb, cd\}^*$,
and

$$f : a \mapsto 0,\quad b \mapsto 1,\quad c \mapsto 00,\quad d \mapsto 11.$$

Write $v = prs$ where $r$ is a maximal string of alternating $a$'s and $b$'s in $v$; thus $r$ lies in
$(\epsilon + b)(ab)^*(\epsilon + a)$. If $|s| \ge 2$, then we claim that $|r| = 1$ or $|pr| < 3$. For suppose that
$|r| \ge 2$, $|pr| \ge 3$ and $|s| \ge 2$. Let $s_1$, $s_2$ be the first two letters of $s$. Then $s_1$ must be $c$
or $d$; otherwise, $rs_1$ is an alternating string of $a$'s and $b$'s that is longer than $r$. Suppose
$s_1 = c$. (The other case is similar.) Since $|r| \ge 2$ and $|pr| \ge 3$, we conclude that $prs_1s_2$
has $yabcs_2$ as a suffix, some $y \in \{b, d\}$. But then $z$ contains a factor $f(yabcs_2)$, which has a
factor $1f(abc)1 = 101001 = xxx^R$, where $x = 10$. This is impossible.

If $ab$ or $ba$ is a factor of $v$, we can write $v = prs$ as above, with $|r| \ge 2$. This implies that
$|s| \le 1$ or $|pr| \le 2$. If $|pr| \le 2$, then $p = \epsilon$, $|r| = 2$, since $|r| \ge 2$; in this case $pr = ab$. If
$|s| \le 1$, then, since $z$ ends in 1, either $s = \epsilon$ or $s = d$. In the first case, $ab$ is a suffix of $v$; in
the second $ad$ is a suffix. It follows that every instance of $ab$ or $ba$ in $v$ either occurs in a
prefix of length 2, or in a suffix of the form $(\epsilon + b)(ab)^*(\epsilon + ad)$. The given suffix maps under
$f$ to a suffix $t \in (\epsilon + 1)(01)^*(\epsilon + 1)$ of $z$. We therefore can write $z = p_1z_1t$ such that $|p_1| \le 2$,
and $z_1 = f(v_1)$, for some $v_1 \in \{ad, cb, cd\}^*$ where $ba$ is not a factor of $v_1$.

Write $v_1 = prs$ where $r$ is a maximal string of alternating $c$'s and $d$'s in $v_1$. First of all,
note that $|r| < 7$; we check that $f(cdcdcdc)$ contains $xxx^R$ with $x = 0f(d)0 = 0110$, and, symmetrically,
$f(dcdcdcd)$ contains $xxx^R$ with $x = 1f(c)1 = 1001$. We claim that $|r| < 3$ or $|pr| < 7$. For otherwise,
suppose that $|r| \ge 3$, and $|p'r| = 7$, where $p'$ is a suffix of $p$. Assume that the first letter of
$r$ is $c$. (The other case is similar.) Since $|r| < 7$, $p' \ne \epsilon$. Since $r$ is maximal, the last letter
of $p'$ is a $b$. If $|p'| = 1$, then $f(p'r) = f(bcdcdcd)$, which contains $xxx^R$ with $x = 1f(c)1 = 1001$; this is
impossible. If $|p'| \ge 2$, then $cb$ is a suffix of $p'$ (since $ab$ is not a factor of $v_1$.) However, then
$p'r$ contains the factor $cbcdc$, and $f(cbcdc) = 001001100 = xxx^R$, where $x = 001$, so this is
also impossible. It follows that every instance of $cdc$ or $dcd$ in $v_1$ occurs in a prefix of $v_1$ of
length 6. Removing a prefix $p'$ of length at most 7 from $v_1$ then gives a suffix $v_2$, such that
the first letter of $v_2$ is $a$ or $c$, and neither of $cdc$ and $dcd$ is a factor of $v_2$. We can thus write
$z = p_2z_2t$ where $z_2 = f(v_2)$, $v_2 \in \{ad, cb, cd\}^*$, words $ba$, $cdc$, $dcd$ are not factors of $v_2$, and
$|p_2| \le |p_1| + |f(p')| \le 2 + 2(7) - 1 = 15$. (Here, at most 6 letters of $p'$ can be $c$ or $d$, since
$cdcdcdc$ and $dcdcdcd$ lead to instances of $xxx^R$.)

Suppose that $v'$ is any factor of $v_2$ of length 8. We claim that $v'$ contains one of $cd$ or
$dc$ as a factor. Since $v' \notin \{a, b\}^*$, one of $c$ and $d$ is a factor of $v'$. Suppose then that $c$ is a
factor of $v'$. (The other case is similar.) Suppose that neither of $cd$ nor $dc$ is a factor of $v'$. It
follows that $v'$ is $bcbcbcbc$ or $cbcbcbcb$; each of these contains $cbcbcbc$, and $f(cbcbcbc)$ contains
$010010010 = xxx^R$ where $x = 010$.

We may thus write $v_2 = p'\left(\prod_{i=0}^{n} a_i\right)s'$, with $n \ge -1$, $|p'|, |s'| \le 7$, such that each $a_i$ begins
and ends with $c$ or $d$, and neither of $cd$ or $dc$ is a factor of any $a_i$. By $n = -1$ we allow the
possibility that the product term is empty. As a convention, we write the product as empty
if $|v_2|_{cd} + |v_2|_{dc} \le 1$; for $n \ge 0$, then the last letter of $p'$ and the first letter of $s'$ are in $\{c, d\}$.
Suppose $n \ge 0$. Consider $a_i$, $i \ge 0$. Without loss of generality, let $a_i$ begin with $c$. The letter
preceding $a_i$ is either the last letter of $a_{i-1}$, or the last letter of $p'$, and must be a $d$. We
cannot have $|a_i| = 1$, which would force $a_i = c$; word $a_i$ is then followed by the first letter of
$a_{i+1}$ or of $s'$, which must be $d$. Then $dcd$ is a factor of $v_2$, which is impossible. Thus $|a_i| \ge 2$.
Since $cd$ is not a factor of $a_i$, $a_i$ begins with $cb$. Since $a_i$ ends with $c$ or $d$ (not in $b$), $a_i \ne cb$,
so that $|a_i| \ge 3$. Since $ba$ is not a factor of $v_2$, $a_i$ therefore begins with $cbc$. If $a_i \ne cbc$,
then, since $cd$ is not a factor of $a_i$, word $a_i$ begins with $cbcb$, and arguing as previously, with
$cbcbc$. If $cbcbc$ is a proper prefix of $a_i$, then $a_i$ begins with $cbcbcb$. However, $f((cb)^3)0$ contains
an instance of $xxx^R$, so this is impossible: If $a_i$ begins with $c$, then $a_i \in \{cbc, cbcbc\}$. By the
same reasoning, if $a_i$ begins with $d$, then $a_i \in \{dad, dadad\}$.

Let $v_3 = (p')^{-1}v_2(s')^{-1} = \prod_{i=0}^{n} a_i$. Deleting up to the first 5 letters, if necessary, we
assume that $a_0 \in \{cbc, cbcbc\}$ (i.e., if $a_0$ begins with $dad$ or $dadad$, then delete these letters.)
Then $z = p_3z_3s_3t$ where $z_3 = f(v_3)$, $|p_3| \le |f(p')| + |p_2| + 5 \le 2(4) + 3 + 15 + 5 = 31$,
$|s_3| = |f(s')| \le 2(4) + 3 = 11$. Here we use the fact that at most 4 of the letters of $p'$ or $s'$
can be in $\{c, d\}$; otherwise the pigeonhole principle would force an occurrence of $cd$ or $dc$ in
one of these.

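We can write $v_3$ in the form $g(u)$ where $u \in \{L, S\}^*$. Here write $u = \prod_{i=0}^{n} u_i$, each
$u_i \in \{L, S\}$, and let $g$ be the transducer with $g(u) = \prod_{i=0}^{n} g_i(u_i)$, where

$$g_i(u_i) = \begin{cases}
cbc & u_i = S \text{ and } i \text{ even} \\
dad & u_i = S \text{ and } i \text{ odd} \\
cbcbc & u_i = L \text{ and } i \text{ even} \\
dadad & u_i = L \text{ and } i \text{ odd.}
\end{cases}$$

Thus $z_3$ has the form $h(u)$ where $h$ is the transducer

$$h_i(u_i) = \begin{cases}
00100 & u_i = S \text{ and } i \text{ even} \\
11011 & u_i = S \text{ and } i \text{ odd} \\
00100100 & u_i = L \text{ and } i \text{ even} \\
11011011 & u_i = L \text{ and } i \text{ odd.}
\end{cases}$$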

We have thus proved the theorem with C = max(31,11) = 31. □ 
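The small computations invoked in the proof (images under $f$ that contain an instance of $xxx^R$) can be checked mechanically. The sketch below is our own illustration; the helper `instance(x)` simply builds the word $xxx^R$.

```python
# The coding f from the proof of Theorem 1.
F = {"a": "0", "b": "1", "c": "00", "d": "11"}


def f(v: str) -> str:
    """Image of a word over {a, b, c, d} under f."""
    return "".join(F[letter] for letter in v)


def instance(x: str) -> str:
    """The word x x x^R."""
    return x + x + x[::-1]


# Each entry pairs a binary word from the proof with the witness x used there.
checks = [
    ("1" + f("abc") + "1", "10"),   # 1 f(abc) 1 = 101001
    (f("cdcdcdc"), "0110"),          # x = 0 f(d) 0
    (f("dcdcdcd"), "1001"),          # x = 1 f(c) 1
    (f("bcdcdcd"), "1001"),
    (f("cbcdc"), "001"),             # f(cbcdc) = 001001100
    (f("cbcbcbc"), "010"),
    (f("cbcbcb") + "0", "010"),      # f((cb)^3) 0
]
all_witnessed = all(instance(x) in word for word, x in checks)
```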

To study the growth rate of $\mathcal{K}$, it thus suffices to study the growth rate of $\mathcal{M}$.

The transducer $h$ is sensitive to the index of a word modulo 2; thus, suppose $r, s \in \{L, S\}^*$
and $r$ is a suffix of $s$. If $|r|$ and $|s|$ have the same parity, then $h(r)$ is a suffix of $h(s)$. However,
if $|r|$ and $|s|$ have opposite parity, then $\overline{h(r)}$ is a suffix of $h(s)$. (Here the overline indicates
binary complementation.)
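The parity effect just described can be seen on a small example. This is a minimal sketch with our own helper names; the words `s`, `r_same` and `r_opposite` are arbitrary choices made for illustration.

```python
BLOCKS = {"S": "00100", "L": "00100100"}


def complement(w: str) -> str:
    """Swap 0 and 1 in a binary word."""
    return w.translate(str.maketrans("01", "10"))


def h(u: str) -> str:
    """Index-sensitive transduction: complement the block at odd indices."""
    return "".join(
        BLOCKS[c] if i % 2 == 0 else complement(BLOCKS[c])
        for i, c in enumerate(u)
    )


s = "LSLLS"
r_same = "LLS"        # suffix of s, same length parity as s
r_opposite = "SLLS"   # suffix of s, opposite length parity

same_parity_ok = h(s).endswith(h(r_same))
opposite_parity_ok = h(s).endswith(complement(h(r_opposite)))
```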

3. Suitable pairs of words 

Let $\mathcal{S}, \mathcal{L} \in \{S, L\}^*$. Say that the pair $(\mathcal{S}, \mathcal{L})$ is suitable if

(1) $|\mathcal{S}|$, $|\mathcal{L}|$ are odd.

(2) There exist non-empty words $\ell, \mu, p \in \{0,1\}^*$ such that

(a) $h(\mathcal{L}) = \ell\ell^R$

(b) $h(\mathcal{S}) = \ell\mu = \mu^R\ell^R$

(c) $h(\mathcal{L}) = \ell\mu\overline{\mu}^Rp$

We see that $(S, L)$ is suitable; specifically, we could choose $\mu = 0$, $\ell = 0010$, $p = 00$.
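The suitability of $(S, L)$ with the stated witnesses can be confirmed directly. The sketch below assumes that condition 2(c) reads $h(\mathcal{L}) = \ell\mu\overline{\mu}^Rp$, with the overline denoting binary complementation; the variable names are ours.

```python
def complement(w: str) -> str:
    """Swap 0 and 1 in a binary word."""
    return w.translate(str.maketrans("01", "10"))


# Images of the single letters at an even index.
h_S = "00100"
h_L = "00100100"

# Witnesses given in the text.
ell, mu, p = "0010", "0", "00"

cond_a = h_L == ell + ell[::-1]
cond_b = h_S == ell + mu == mu[::-1] + ell[::-1]
# Assumed reading of 2(c): h(L) = ell mu complement(mu)^R p.
cond_c = h_L == ell + mu + complement(mu)[::-1] + p
is_suitable = cond_a and cond_b and cond_c
```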

Since $|\mathcal{S}|$, $|\mathcal{L}|$ are odd, the transducer $h$ is sensitive to the index of a word modulo 2,
where lengths (and indices) are measured in terms of $\mathcal{S}$ and $\mathcal{L}$; i.e., we use the length function
$\|w\| = |w|_{\mathcal{S}} + |w|_{\mathcal{L}}$. Thus, suppose $r, s \in \{\mathcal{S}, \mathcal{L}\}^*$ and $r$ is a suffix of $s$. If $\|r\|$ and $\|s\|$ have
the same parity, then $h(r)$ is a suffix of $h(s)$. However, if $\|r\|$ and $\|s\|$ have opposite parity,
then $\overline{h(r)}$ is a suffix of $h(s)$.

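Lemma 2. Let $\mathcal{S}, \mathcal{L} \in \{S, L\}^*$. Suppose that $(\mathcal{S}, \mathcal{L})$ is suitable.

(1) Word $h(\mathcal{L})p^{-1}$ is a prefix of $h(\mathcal{SS})$.

(2) Word $h(\mathcal{S})$ is both a prefix and suffix of $h(\mathcal{L})$.

Proof. The first of these properties is immediate from property 2(c) of the definition of
suitability, since $h(\mathcal{SS}) = h(\mathcal{S})\overline{h(\mathcal{S})} = \ell\mu\overline{\mu}^R\overline{\ell}^R$. For the second, $h(\mathcal{L}) = \ell\ell^R$ is a
palindrome, so we see that $h(\mathcal{L}) = \ell\mu\overline{\mu}^Rp = p^R\overline{\mu}\mu^R\ell^R = p^R\overline{\mu}\ell\mu$. □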

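Now suppose that $\mathcal{S}$ and $\mathcal{L}$ are fixed and $(\mathcal{S}, \mathcal{L})$ is suitable. Define morphism
$\Phi : \{\mathcal{S}, \mathcal{L}\}^* \to \{\mathcal{S}, \mathcal{L}\}^*$ by $\Phi(\mathcal{S}) = \mathcal{SL}$, $\Phi(\mathcal{L}) = \mathcal{SLL}$.

Morphism $\Phi$ is conjugate to the square of the Fibonacci morphism $D$, where $D(\mathcal{L}) = \mathcal{LS}$,
$D(\mathcal{S}) = \mathcal{L}$; namely, $\Phi = \mathcal{L}^{-1}D^2\mathcal{L}$. This implies that $\|\Phi^k(\mathcal{S})\| = F_{2k}$, $\|\Phi^k(\mathcal{L})\| = F_{2k+1}$,
where $F_k$ is the $k$th Fibonacci number, counting from $F_0 = F_1 = 1$ (we choose this indexing
of the Fibonacci numbers for convenience: in particular, so that $\|\Phi^0(\mathcal{S})\| = F_0 = 1$).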
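The conjugacy and the Fibonacci lengths can be checked for small $k$. In this sketch (our own helper names) the letters `S` and `L` stand for the blocks $\mathcal{S}$ and $\mathcal{L}$, and the conjugacy is tested in the equivalent form $\mathcal{L}\,\Phi(c) = D^2(c)\,\mathcal{L}$ on each letter $c$.

```python
PHI = {"S": "SL", "L": "SLL"}   # the morphism Phi
D = {"L": "LS", "S": "L"}       # the Fibonacci morphism D


def apply(morphism: dict, w: str) -> str:
    """Apply a morphism letter by letter."""
    return "".join(morphism[c] for c in w)


def power(morphism: dict, w: str, k: int) -> str:
    """Apply a morphism k times."""
    for _ in range(k):
        w = apply(morphism, w)
    return w


def fib(k: int) -> int:
    """Fibonacci numbers indexed so that F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a


# Phi = L^{-1} D^2 L, i.e. L Phi(c) = D^2(c) L for each letter c.
conjugacy_holds = all("L" + apply(PHI, c) == power(D, c, 2) + "L" for c in "SL")

lengths_match = all(
    len(power(PHI, "S", k)) == fib(2 * k)
    and len(power(PHI, "L", k)) == fib(2 * k + 1)
    for k in range(7)
)
```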

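Lemma 3. Let $\beta \in \{\mathcal{S}, \mathcal{L}\}^*$. Then

(1) $h(\Phi(\mathcal{S}\beta))$ is a prefix of $h(\Phi(\mathcal{L}\beta))$ and $h(\Phi^2(\mathcal{S}\beta))$ is a prefix of $h(\Phi^2(\mathcal{L}\beta))$.

(2) $\overline{h(\Phi(\mathcal{S}\beta))}$ is a suffix of $h(\Phi(\mathcal{L}\beta))$.

(3) $\overline{h(\Phi^2(\mathcal{S}\beta))}$ is a suffix of $h(\Phi^2(\mathcal{L}\beta))$.

(4) $h(\Phi(\mathcal{L}))p^{-1}$ is a prefix of $h(\Phi(\mathcal{SS}))$.

(5) $h(\Phi^2(\mathcal{L}))(\overline{p})^{-1}$ is a prefix of $h(\Phi^2(\mathcal{SS}))$.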

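Proof. Since $\Phi(\mathcal{S})$ is a prefix of $\Phi(\mathcal{L})$, $\Phi(\mathcal{S}\beta)$ is a prefix of $\Phi(\mathcal{L}\beta)$, so that $h(\Phi(\mathcal{S}\beta))$ is a
prefix of $h(\Phi(\mathcal{L}\beta))$. Similarly, $h(\Phi^2(\mathcal{S}\beta))$ is a prefix of $h(\Phi^2(\mathcal{L}\beta))$, establishing (1).

Since $h(\mathcal{S})$ is a suffix of $h(\mathcal{L})$ by Lemma 2, the image of $\Phi(\mathcal{S}) = \mathcal{SL}$ is, up to complementation,
a suffix of the image of $\Phi(\mathcal{L}) = \mathcal{SLL}$. Because $\|\Phi(\mathcal{L})\|$ is odd, while
$\|\Phi(\mathcal{S})\|$ is even, it follows that $\overline{h(\Phi(\mathcal{S}))}$ is a suffix of $h(\Phi(\mathcal{L}))$. More generally, if $\beta \in \{\mathcal{S}, \mathcal{L}\}^*$,
$\overline{h(\Phi(\mathcal{S}\beta))}$ is a suffix of $h(\Phi(\mathcal{L}\beta))$, establishing (2). The proof of (3) is similar.

For (4), $h(\Phi(\mathcal{L}))p^{-1} = h(\mathcal{SLL})p^{-1} = h(\mathcal{SL})h(\mathcal{L})p^{-1}$, which is a prefix of $h(\mathcal{SL})h(\mathcal{SS})$,
which is in turn a prefix of $h(\mathcal{SL})h(\mathcal{SL}) = h(\Phi(\mathcal{SS}))$.

For (5), $h(\Phi^2(\mathcal{L}))(\overline{p})^{-1} = h(\Phi(\mathcal{SL})\Phi(\mathcal{L}))(\overline{p})^{-1} = h(\Phi(\mathcal{SL}))\overline{h(\Phi(\mathcal{L}))p^{-1}}$ (since $\|\Phi(\mathcal{SL})\|$ is
odd), which is a prefix of $h(\Phi(\mathcal{SL}))\overline{h(\Phi(\mathcal{SS}))} = h(\Phi(\mathcal{SL})\Phi(\mathcal{SS}))$, which is in turn a prefix
of $h(\Phi(\mathcal{SLSL})) = h(\Phi^2(\mathcal{SS}))$.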

□ 


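Define the set $B \subset \{\mathcal{S}, \mathcal{L}\}^*$:

$$\begin{aligned}
B ={}& (\mathcal{S} + \mathcal{L})\mathcal{SSSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL}) \cup \mathcal{LSSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL}) \cup (\mathcal{S} + \mathcal{L})\mathcal{LLLLL}(\mathcal{S} + \mathcal{L}) \\
&\cup (\mathcal{S} + \mathcal{L})\mathcal{LSLLL}(\mathcal{S} + \mathcal{L}) \\
&\cup \Phi((\mathcal{S} + \mathcal{L})\mathcal{SS}(\mathcal{S} + \mathcal{L})) \cup \Phi((\mathcal{S} + \mathcal{L})\mathcal{LLLSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL})) \\
&\cup \Phi^2(\mathcal{LLL}(\mathcal{S} + \mathcal{L})) \cup \Phi^2((\mathcal{S} + \mathcal{L})\mathcal{LSS}(\mathcal{S} + \mathcal{L})) \cup \Phi^2((\mathcal{S} + \mathcal{L})\mathcal{SSSSS}(\mathcal{S} + \mathcal{L}))
\end{aligned}$$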

Lemma 4. Let $u \in \mathcal{M}$. Then no word of $B$ is a factor of $u$.

Proof. It suffices to show that for each word $b \in B$, $h(b)$ contains a non-empty factor $xxx^R$.
$B$ is written as a union, and we make cases based on the piece of the union to which $b$ belongs:

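$b \in (\mathcal{S} + \mathcal{L})\mathcal{SSSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL})$: In this case, it suffices to show that $h(\mathcal{SSSSLL})(\overline{p})^{-1}$
contains a non-empty factor $xxx^R$, because of the results of Lemma 2. In particular,
$h(\mathcal{SSSSLL})(\overline{p})^{-1}$ is a suffix of $h(\mathcal{LSSSLL})(\overline{p})^{-1}$, which is a prefix of $h(\mathcal{LSSSLSS})$, which
is a prefix of $h(\mathcal{LSSSLSL})$. Again, $h(\mathcal{SSSSLL})(\overline{p})^{-1}$ is a prefix of $h(\mathcal{SSSSLSS})$, which is
a prefix of $h(\mathcal{SSSSLSL})$. Now

$$\begin{aligned}
h(\mathcal{SSSSLL})(\overline{p})^{-1}
&= (\ell\mu)(\overline{\mu}^R\overline{\ell}^R)(\ell\mu)(\overline{\mu}^R\overline{\ell}^R)(\ell\ell^R)(\overline{\ell}\,\overline{\mu}\mu^R) \\
&= \ell\mu\overline{\mu}^R\overline{\ell}^R\ell\mu\overline{\mu}^R\overline{\ell}^R\ell\ell^R\overline{\ell}\,\overline{\mu}\mu^R \\
&= \ell \cdot \mu\overline{\mu}^R\overline{\ell}^R\ell \cdot \mu\overline{\mu}^R\overline{\ell}^R\ell \cdot \ell^R\overline{\ell}\,\overline{\mu}\mu^R
\end{aligned}$$

which contains an instance of $xxx^R$ with $x = \mu\overline{\mu}^R\overline{\ell}^R\ell$.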

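$b \in \mathcal{LSSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL})$: In this case, it suffices to show that $h(\mathcal{LSSLL})p^{-1}$ contains a
non-empty factor $xxx^R$, because of the results of Lemma 2. But

$$\begin{aligned}
h(\mathcal{LSSLL})p^{-1}
&= (p^R\overline{\mu}\mu^R\ell^R)(\overline{\ell}\,\overline{\mu})(\mu^R\ell^R)(\overline{\ell}\,\overline{\ell}^R)(\ell\mu\overline{\mu}^R) \\
&= p^R\overline{\mu}\mu^R\ell^R\overline{\ell}\,\overline{\mu}\mu^R\ell^R\overline{\ell}\,\overline{\ell}^R\ell\mu\overline{\mu}^R \\
&= p^R \cdot \overline{\mu}\mu^R\ell^R\overline{\ell} \cdot \overline{\mu}\mu^R\ell^R\overline{\ell} \cdot \overline{\ell}^R\ell\mu\overline{\mu}^R
\end{aligned}$$

which contains the instance $xxx^R$ with $x = \overline{\mu}\mu^R\ell^R\overline{\ell}$.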

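$b \in (\mathcal{S} + \mathcal{L})\mathcal{L}^5(\mathcal{S} + \mathcal{L})$: In this case, it suffices to show that $h(\mathcal{S}\mathcal{L}^5\mathcal{S})$ contains a non-empty
factor $xxx^R$, because of the results of Lemma 2. But

$$\begin{aligned}
h(\mathcal{S}\mathcal{L}^5\mathcal{S})
&= (\mu^R\ell^R)(\overline{\ell}\,\overline{\ell}^R)(\ell\ell^R)(\overline{\ell}\,\overline{\ell}^R)(\ell\ell^R)(\overline{\ell}\,\overline{\ell}^R)(\ell\mu) \\
&= \mu^R\ell^R\overline{\ell}\,\overline{\ell}^R\ell\ell^R\overline{\ell}\,\overline{\ell}^R\ell\ell^R\overline{\ell}\,\overline{\ell}^R\ell\mu \\
&= \mu^R \cdot \ell^R\overline{\ell}\,\overline{\ell}^R\ell \cdot \ell^R\overline{\ell}\,\overline{\ell}^R\ell \cdot \ell^R\overline{\ell}\,\overline{\ell}^R\ell \cdot \mu
\end{aligned}$$

which contains the instance $xxx^R$ with $x = \ell^R\overline{\ell}\,\overline{\ell}^R\ell$ (note that here $x = x^R$).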

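$b \in (\mathcal{S} + \mathcal{L})\mathcal{LSLLL}(\mathcal{S} + \mathcal{L})$: In this case, it suffices to show that $h(\mathcal{SLSLLLS})$ contains
a non-empty factor $xxx^R$, because of the results of Lemma 2. Here

$$\begin{aligned}
h(\mathcal{SLSLLLS})
&= (\ell\mu)(\overline{\ell}\,\overline{\ell}^R)(\ell\mu)(\overline{\ell}\,\overline{\ell}^R)(\ell\ell^R)(\overline{\ell}\,\overline{\ell}^R)(\mu^R\ell^R) \\
&= \ell\mu\overline{\ell}\,\overline{\ell}^R\ell\mu\overline{\ell}\,\overline{\ell}^R\ell\ell^R\overline{\ell}\,\overline{\ell}^R\mu^R\ell^R \\
&= \ell \cdot \mu\overline{\ell}\,\overline{\ell}^R\ell \cdot \mu\overline{\ell}\,\overline{\ell}^R\ell \cdot \ell^R\overline{\ell}\,\overline{\ell}^R\mu^R \cdot \ell^R
\end{aligned}$$

which contains the instance $xxx^R$ with $x = \mu\overline{\ell}\,\overline{\ell}^R\ell$.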

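$b \in \Phi((\mathcal{S} + \mathcal{L})\mathcal{SS}(\mathcal{S} + \mathcal{L}))$: In this case, it suffices to show that $h(\Phi(\mathcal{SSSS}))$ contains a
non-empty factor $xxx^R$, because of the results of Lemma 3. In particular, $h(\Phi(\mathcal{SSSS}))$ is a
prefix of $h(\Phi(\mathcal{SSSL}))$, $\overline{h(\Phi(\mathcal{SSSS}))}$ is a suffix of $h(\Phi(\mathcal{LSSS}))$, and $\overline{h(\Phi(\mathcal{SSSL}))}$ is a suffix
of $h(\Phi(\mathcal{LSSL}))$. However,

$$\begin{aligned}
h(\Phi(\mathcal{SSSS}))
&= (\mu^R\ell^R\overline{\ell}\,\overline{\ell}^R)(\mu^R\ell^R\overline{\ell}\,\overline{\ell}^R)(\mu^R\ell^R\overline{\ell}\,\overline{\ell}^R)(\mu^R\ell^R\overline{\ell}\,\overline{\ell}^R) \\
&= \mu^R\ell^R\overline{\ell} \cdot \overline{\ell}^R\mu^R\ell^R\overline{\ell} \cdot \overline{\ell}^R\mu^R\ell^R\overline{\ell} \cdot \overline{\ell}^R\mu^R\ell^R\overline{\ell} \cdot \overline{\ell}^R
\end{aligned}$$

containing an instance of $xxx^R$, with $x = \overline{\ell}^R\mu^R\ell^R\overline{\ell}$ (here $x = x^R$, since $\mu^R\ell^R = \ell\mu$).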

$b \in \Phi((\mathcal{S} + \mathcal{L})\mathcal{LLLSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL}))$: In this case, it suffices to show that $h(\Phi(\mathcal{SLLLSLL}))p^{-1}$
contains a non-empty factor $xxx^R$, because of the results of Lemma 3. But

h(${SCC£SCC))p- x 

= {fi R £ R W) (h r £ r W££ r ) {JJW££ r W) (h r £ r W££ r ) (Ijl££ R W) 

= n R £ r ~££ r /i r £ r ~££ r ££ r Ji r £ r £ £ R W i n R £ R ££ R ££ R JI R £ R £ £ R JlM R W i £^M R £ 

an instance of xxx R with x = £ R ££ R n R £ R ££ R ££ R n R £ R £. 

$b \in \Phi^2(\mathcal{LLL}(\mathcal{S} + \mathcal{L}))$: In this case, it suffices to show that $h(\Phi^2(\mathcal{LLLS}))$ contains a
non-empty factor $xxx^R$, because of the results of Lemma 3. But

/r(<L 2 (££££)) 

= {£iI^£iI^££ R J^£l R l£ R ) {£iM R ^ R £ R W££ R ~i^££ R Jl R ) 

(, u R £ R l£ R £\l£ R ££ R ) 

= £fi£ ■ £ R £iM R ££ R Iid£ R ££ R £id ■ J R £^j£ R ££ R Ijl££ R j£ R £^£ ■ Wfi n £ R Wm n JJH R ££ R l£ R l i R £ R £ 

containing an instance of xxx R , with x = £ R £ii££ R ££ R £fi££ R ££ R £/i£. 

$b \in \Phi^2((\mathcal{S} + \mathcal{L})\mathcal{LSS}(\mathcal{S} + \mathcal{L}))$: In this case, it suffices to show that $h(\Phi^2(\mathcal{SLSSS}))$ contains
a non-empty factor $xxx^R$, because of the results of Lemma 3. Now











h(<I> 2 (5£555)) 

= (£il£ R £pl£ R ££ R ) (£^M R TiiU R W£iMKU R ) (Ijle£ R IjlU R W) {£/iW^ r £ r W££ r ) 
(J<^££ R J£W££ R W) 

= £idg*£irt ■ J R ££ R I/f££ R Iif££ R ££ R £pI ■ l^U R J^U R J^t£ R W£[I 
^li R £ R Wm R JIWH£ R JJWm R l ■ I s 

containing an instance of xxx R , with x = t R l£ R £\ju£l R lp£l R lt R l\ii. 

$b \in \Phi^2((\mathcal{S} + \mathcal{L})\mathcal{S}^5(\mathcal{S} + \mathcal{L}))$: In this case, it suffices to show that $h(\Phi^2(\mathcal{S}^7))$ contains
a non-empty factor $xxx^R$, because of the results of Lemma 3. Finally,

h^ 2 (S 7 )) 

= (itix££ R £ix ££ R ££ R ) 0 ^££ R p R £ R ££ R ££ R ) (£n££ R £n££ R ££ R ) {£p££ R ^ R £ R ££ R ££ R ) 
(£ld£ R £id£ R ££ R ) (£fi££ R 'JI R £ R ££ R ££ R ) {£n££ R £id£ R ££ R ) 

= £y£ ■ l R £iiW££ R Jii££ R 'JI^££ R W^ R £ Rr £ ■ W£pW££ R IJl££ R T^££ R Wp R £ R £ 
■J R £lil£ R ££ R Iji££ R JJ^£l R j£ R li R £ R I ■ ¥ { p R £ R W££ R 
containing an instance of xxx R , with x = £ R £n££ R ££ R £n££ R /j, R £ R ££ R ££ R fi R £ R £. □ 
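For the concrete suitable pair $(\mathcal{S}, \mathcal{L}) = (S, L)$, the claim of Lemma 4 can be spot-checked by machine. The following sketch is an illustration under stated assumptions: it covers only the pieces of $B$ that do not involve $\Phi^2$, it takes the pieces of $B$ as read in this section, and all helper names are ours.

```python
from itertools import product

BLOCKS = {"S": "00100", "L": "00100100"}
PHI = {"S": "SL", "L": "SLL"}


def complement(w: str) -> str:
    return w.translate(str.maketrans("01", "10"))


def h(u: str) -> str:
    """Index-sensitive transduction: complement the block at odd indices."""
    return "".join(
        BLOCKS[c] if i % 2 == 0 else complement(BLOCKS[c])
        for i, c in enumerate(u)
    )


def phi(w: str) -> str:
    return "".join(PHI[c] for c in w)


def contains_xxxR(w: str) -> bool:
    n = len(w)
    for k in range(1, n // 3 + 1):
        for i in range(n - 3 * k + 1):
            x = w[i:i + k]
            if w[i + k:i + 2 * k] == x and w[i + 2 * k:i + 3 * k] == x[::-1]:
                return True
    return False


def expand(*parts):
    """All concatenations taking one alternative from each part."""
    return ["".join(choice) for choice in product(*parts)]


EITHER = ("S", "L")
TAIL = ("L", "SS", "SL")
families = {
    "(S+L)SSSL(L+SS+SL)": expand(EITHER, ("SSSL",), TAIL),
    "LSSL(L+SS+SL)": expand(("LSSL",), TAIL),
    "(S+L)LLLLL(S+L)": expand(EITHER, ("LLLLL",), EITHER),
    "(S+L)LSLLL(S+L)": expand(EITHER, ("LSLLL",), EITHER),
    "Phi((S+L)SS(S+L))": [phi(b) for b in expand(EITHER, ("SS",), EITHER)],
}
all_forbidden = all(
    contains_xxxR(h(b)) for words in families.values() for b in words
)
```

Only images starting at an even index are tested; the image at an odd index is the complement, and complementation preserves instances of $xxx^R$.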

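4. Parsing words of $\mathcal{M}$ using $\Phi$

Lemma 5. Let $y \in \{\mathcal{L}, \mathcal{S}\}^* \cap \mathcal{M}$. Then $y$ can be written

$$y = p_1\Phi(y_1)s_1t_1,$$

where $|p_1|, |s_1| \le 9$, $y_1 \in \{\mathcal{L}, \mathcal{S}\}^*$, and $t_1 \in (\epsilon + \mathcal{S} + \mathcal{S}^2 + \mathcal{S}^3)\mathcal{L}\mathcal{S}^* + \mathcal{S}^*(\epsilon + \mathcal{L} + \mathcal{LS})$. (Here
all lengths are as words of $\{\mathcal{L}, \mathcal{S}\}^*$; thus, for example $|p_1| = |p_1|_{\mathcal{L}} + |p_1|_{\mathcal{S}}$.)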

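Proof. Suppose that $|y|_{\mathcal{L}} = n$. If $n = 0$, the lemma is true, letting $t_1 = y$. If $n = 1$, write
$y = \mathcal{S}^k\mathcal{L}\mathcal{S}^j$. Since by Lemma 4, $\mathcal{SSSSLSS}$ cannot be a factor of $y \in \mathcal{M}$, we have $k \le 3$ or
$j \le 1$; thus we can again let $t_1 = y$, and we are again done.

Suppose from now on, that $n \ge 2$, and write $y = \left(\prod_{i=1}^{n} \mathcal{S}^{m_i}\mathcal{L}\right)\mathcal{S}^{m_{n+1}}$, where each $m_i \ge 0$.
For $1 \le i \le n - 1$, word $\mathcal{L}\mathcal{S}^{m_{i+1}}\mathcal{L}$ has one of $\mathcal{LL}$, $\mathcal{LSL}$ or $\mathcal{LSS}$ as a prefix, depending on
whether $m_{i+1} = 0$, $1$ or $m_{i+1} \ge 2$, respectively. This implies that for $1 \le i \le n - 1$, we have
$m_i \le 3$, since by Lemma 4, no word of $\mathcal{S}^4(\mathcal{LL} + \mathcal{LSL} + \mathcal{LSS})$ can be a factor of $y \in \mathcal{M}$. For
$2 \le i \le n - 1$, we have $m_i \le 1$, since no word of $\mathcal{L}(\mathcal{S}^2 + \mathcal{S}^3)(\mathcal{LL} + \mathcal{LSL} + \mathcal{LSS})$ can appear
in $y$. Since $\mathcal{S}^4\mathcal{L}\mathcal{S}^2$ cannot be a factor of $y \in \mathcal{M}$, if $m_{n+1} \ge 2$, then $m_n \le 3$. We have thus
established that

$$y \in (\epsilon + \mathcal{S} + \mathcal{S}^2 + \mathcal{S}^3)\mathcal{L}\,((\epsilon + \mathcal{S})\mathcal{L})^*\,\left((\epsilon + \mathcal{S} + \mathcal{S}^2 + \mathcal{S}^3)\mathcal{L}\mathcal{SS}\mathcal{S}^* + \mathcal{S}^*\mathcal{L}(\epsilon + \mathcal{S})\right)$$

Write $y = p'y't_1$, where

$$p' \in (\epsilon + \mathcal{S} + \mathcal{S}^2 + \mathcal{S}^3), \quad y' \in \mathcal{L}((\epsilon + \mathcal{S})\mathcal{L})^*,$$

$$t_1 \in (\epsilon + \mathcal{S} + \mathcal{S}^2 + \mathcal{S}^3)\mathcal{L}\mathcal{SS}\mathcal{S}^* + \mathcal{S}^*\mathcal{L}(\epsilon + \mathcal{S}).$$

In particular, $\mathcal{SS}$ is not a factor of $y'$.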













Without loss of generality, suppose $|y| \ge 7$ and $|y'| \ge 6$. (If $|y| \le 6$ or $|y'| \le 5$, let
$p_1 = p'y'$, $y_1 = s_1 = \epsilon$, and the lemma holds.) Write $y' = p''y''s_1$, where $|p''| = 4$, $|s_1| = 2$. We
next consider the placement in $y$, $y'$, $y''$ of hypothetical factors $\mathcal{L}^k$, $k \ge 3$:

• $\mathcal{L}^k$, $k \ge 6$, cannot be a factor of $y$: If $\mathcal{L}^6$ is a factor of $y$, so is one of $\mathcal{S}\mathcal{L}^6$, $\mathcal{L}^6\mathcal{S}$ or $\mathcal{L}^7$,
since $|y| \ge 7$; this is impossible.

• $\mathcal{L}^5$ can only appear in $y$ as a prefix or suffix: Otherwise, $y$ contains some two-sided
extension of $\mathcal{L}^5$. As $\mathcal{L}^6$ is not a factor of $y$, this must be $\mathcal{S}\mathcal{L}^5\mathcal{S}$. This is impossible
by Lemma 4.

• $\mathcal{L}^4$ is not a factor of $\rho y''\sigma$, where $\rho$ is the last letter of $p''$ and $\sigma$ is the first letter of $s_1$:
The length 5 left extension of an occurrence of $\mathcal{L}^4$ in $\rho y''\sigma$ cannot be $\mathcal{L}^5$ because of
the previous paragraph; it must be $\mathcal{S}\mathcal{L}^4$. Since $\mathcal{SS}$ is not a factor of $y'$, the further left
extension $\mathcal{LS}\mathcal{L}^4$ must thus also be a factor of $y'$. However, this forces $y'$ to contain
one of the further left extensions $\mathcal{LLS}\mathcal{L}^4$ and $\mathcal{SLS}\mathcal{L}^4$, which is impossible.

• $\mathcal{L}^3$ is not a factor of $y''$: Suppose that $\mathcal{L}^3$ is a factor of $y''$. By the previous paragraph,
its extension $\mathcal{S}\mathcal{L}^3\mathcal{S}$ is a factor of $\rho y''\sigma$. Since $\mathcal{SS}$ is not a factor of $y'$, the extension of
$\mathcal{S}\mathcal{L}^3\mathcal{S}$ to $\mathcal{LS}\mathcal{L}^3\mathcal{S}$ must be a factor of $y'$. One of the further left extensions $\mathcal{LLS}\mathcal{L}^3\mathcal{S}$
and $\mathcal{SLS}\mathcal{L}^3\mathcal{S}$ must thus occur in $y'$, but these are impossible by Lemma 4.

We have now shown that neither of $\mathcal{S}^2$ and $\mathcal{L}^3$ can be a factor of $y''$. Thus

$$y'' \in (\mathcal{L} + \mathcal{LL})(\mathcal{SL} + \mathcal{SLL})^*.$$

Let $p'''$ be the longest prefix of $y''$ of the form $\mathcal{L}^k$, and write $y'' = p'''\Phi(y_1)$. Letting $p_1 = p'p''p'''$,
we have $|p_1| \le 3 + 4 + 2$, so the lemma holds. □

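5. Parsing words of $\mathcal{M}$ using $\Phi^2$

Lemma 6. Let $y_1 \in \{\mathcal{L}, \mathcal{S}\}^*$, such that $\Phi(y_1) \in \mathcal{M}$. Then $y_1$ can be written

$$y_1 = p_2\Phi(y_2)s_2t_2,$$

where $|p_2|, |s_2| \le 4$, $y_2 \in \{\mathcal{L}, \mathcal{S}\}^*$ and
$t_2 \in \left((\epsilon + \mathcal{L} + \mathcal{L}^2 + \mathcal{L}^3)\mathcal{S}\mathcal{L}^* + \mathcal{L}^*(\epsilon + \mathcal{S} + \mathcal{SL})\right)(\epsilon + \mathcal{S} + \mathcal{L})$.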

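Proof. From Lemma 4, no word of

$$(\mathcal{S} + \mathcal{L})\mathcal{SS}(\mathcal{S} + \mathcal{L}) \cup (\mathcal{S} + \mathcal{L})\mathcal{LLLSL}(\mathcal{L} + \mathcal{SS} + \mathcal{SL})$$

$$\cup\ \Phi(\mathcal{LLL}(\mathcal{S} + \mathcal{L})) \cup \Phi((\mathcal{S} + \mathcal{L})\mathcal{LSS}(\mathcal{S} + \mathcal{L})) \cup \Phi((\mathcal{S} + \mathcal{L})\mathcal{SSSSS}(\mathcal{S} + \mathcal{L}))$$

can appear in $y_1$. This includes all length 4 two-sided extensions of $\mathcal{SS}$; it follows that $\mathcal{SS}$
can only appear in $y_1$ as a prefix or suffix.

If $|y_1| \le 1$, we are done. In this case, let $p_2 = y_1$, $y_2 = s_2 = t_2 = \epsilon$. Therefore, we will
assume that $|y_1| \ge 2$, and write $y_1 = p'y's'$, $|p'| = |s'| = 1$. Then $\mathcal{SS}$ is not a factor of $y'$.

Suppose that $|y'|_{\mathcal{S}} = n$. If $n = 0$, the lemma is true, letting $p_2 = p'$, $y_2 = s_2 = \epsilon$, $t_2 = y's'$.
If $n = 1$, write $y' = \mathcal{L}^k\mathcal{S}\mathcal{L}^j$. Since $\mathcal{L}^4\mathcal{S}\mathcal{L}^2$ is not a factor of $y_1$, $k \le 3$ or $j \le 1$; thus we can
let $p_2 = p'$, $t_2 = y's'$, and we are again done.

Suppose from now on, that $n \ge 2$, and write $y' = \left(\prod_{i=1}^{n} \mathcal{L}^{m_i}\mathcal{S}\right)\mathcal{L}^{m_{n+1}}$, where each $m_i \ge 0$.
For $1 \le i \le n - 1$, $m_{i+1} \ge 1$, since $\mathcal{SS}$ is not a factor of $y'$. It follows that for $1 \le i \le n - 2$,
$\mathcal{S}\mathcal{L}^{m_{i+1}}\mathcal{S}\mathcal{L}^{m_{i+2}}$ has one of $\mathcal{SLSL}$ or $\mathcal{SLL}$ as a prefix. This implies that for $1 \le i \le n - 2$, we
have $m_i \le 3$, since $\mathcal{L}^4\mathcal{SLSL}$ and $\mathcal{L}^4\mathcal{SLL}$ are not factors of $y_1$. In fact, for $2 \le i \le n - 2$, we
have $m_i \le 2$, since $\mathcal{S}\mathcal{L}^3\mathcal{SLSL}$ and $\mathcal{S}\mathcal{L}^3\mathcal{SLL}$ are not factors of $y_1$. We have thus established
that

$$y' \in (\epsilon + \mathcal{L} + \mathcal{L}^2 + \mathcal{L}^3)(\mathcal{SL} + \mathcal{SLL})^*\,\mathcal{S}\mathcal{L}^k\mathcal{S}\mathcal{L}^j$$

Since $\mathcal{L}^4\mathcal{S}\mathcal{L}^2$ is not a factor of $y_1$, we require $k \le 3$ or $j \le 1$. Write $y' = p''\Phi(y_2)\mathcal{S}t''$ where
$p'' \in (\epsilon + \mathcal{L} + \mathcal{L}^2 + \mathcal{L}^3)$, $\Phi(y_2) \in (\mathcal{SL} + \mathcal{SLL})^*$, $t'' \in \mathcal{L}^k\mathcal{S}\mathcal{L}^j$, $k \le 3$ or $j \le 1$. Let $p_2 = p'p''$,
$s_2 = \mathcal{S}$, $t_2 = t''s'$. The lemma is established. □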

6. Parsing words of $\mathcal{M}$ using $\Phi^3$

Lemma 7. Let $y_2 \in \{\mathcal{L}, \mathcal{S}\}^*$ such that $\Phi^2(y_2) \in \mathcal{M}$. Then $y_2$ can be written

$$y_2 = p_3\Phi(y_3)s_3,$$

where $|p_3|, |s_3| \le 6$, $y_3 \in \{\mathcal{L}, \mathcal{S}\}^*$.

Proof. From Lemma 4, no word of

$$\mathcal{LLL}(\mathcal{S} + \mathcal{L}) \cup (\mathcal{S} + \mathcal{L})\mathcal{LSS}(\mathcal{S} + \mathcal{L}) \cup (\mathcal{S} + \mathcal{L})\mathcal{SSSSS}(\mathcal{S} + \mathcal{L})$$

can appear in $y_2$. These include both of the length 4 right extensions of $\mathcal{LLL}$; it follows
that $\mathcal{LLL}$ can only appear in $y_2$ as a suffix. They also include all of the length 5 two-sided
extensions of $\mathcal{LSS}$. Thus $\mathcal{LSS}$ can appear in $y_2$ only as a prefix or suffix. Finally, they
include all length 7 two-sided extensions of $\mathcal{S}^5$. Thus, $\mathcal{S}^5$ can only appear in $y_2$ as a suffix or
prefix. If $|y_2| \le 4$, we are done. Assume that $|y_2| \ge 5$, and write $y_2 = p'y's'$, $|p'| = 4$, $|s'| = 1$.
Then $\mathcal{LLL}$ is not a factor of $y'$. We also claim that $\mathcal{SS}$ is not a factor of $y'$. Otherwise, $y_2$
has a factor $\rho\mathcal{SS}$ which is not a suffix, with $|\rho| = 4$. However, the length 5 suffix of $\rho\mathcal{SS}$ is
not a prefix or suffix of $y_2$, and contains either $\mathcal{S}^5$ or $\mathcal{LSS}$ as a factor; this is impossible.

Since neither of $\mathcal{L}^3$ or $\mathcal{S}^2$ is a factor of $y'$, we have $y' \in (\epsilon + \mathcal{L} + \mathcal{L}^2)(\mathcal{SL} + \mathcal{SLL})^*(\epsilon + \mathcal{S})$,
and can write $y' = \mathcal{L}^k\Phi(y_3)\mathcal{S}^j$ where $k \le 2$, $j \le 1$. The lemma therefore holds. □

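7. A hierarchy of S's and L's

Combining Lemmas 5 through 7 gives the following:

Lemma 8. Let $y \in \{\mathcal{L}, \mathcal{S}\}^* \cap \mathcal{M}$. Then $y$ can be written

$$y = p_1\Phi(p_2\Phi(p_3\Phi(y_3)s_3)s_2t_2)s_1t_1,$$

where $|p_1|, |s_1| \le 9$, $|p_2|, |s_2| \le 4$, $|p_3|, |s_3| \le 6$, and

$$t_1 \in (\epsilon + \mathcal{S} + \mathcal{S}^2 + \mathcal{S}^3)\mathcal{L}\mathcal{S}^* + \mathcal{S}^*(\epsilon + \mathcal{L} + \mathcal{LS}),$$

$$t_2 \in \left((\epsilon + \mathcal{L} + \mathcal{L}^2 + \mathcal{L}^3)\mathcal{S}\mathcal{L}^* + \mathcal{L}^*(\epsilon + \mathcal{S} + \mathcal{SL})\right)(\epsilon + \mathcal{S} + \mathcal{L}).$$

Corollary 9. Let $y \in \{\mathcal{L}, \mathcal{S}\}^* \cap \mathcal{M}$. Then there is a constant $\kappa$ such that $y$ can be written

$$y = \pi\Phi^3(y_3)\sigma,$$

where $\sigma$ can be written $\sigma_3\Phi(\mathcal{L}^j)\sigma_2\mathcal{S}^k\sigma_1$, with $|\pi\sigma_1\sigma_2\sigma_3| \le \kappa$.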

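Lemma 10. Suppose that $(\mathcal{S}, \mathcal{L})$ is suitable, and $|h(\mathcal{S})|$ is odd, $|h(\mathcal{L})|$ even. Let

$$\Sigma = (\mathcal{SLSL})^{-1}\Phi^3(\mathcal{S})\mathcal{SLSL}, \qquad \Lambda = (\mathcal{SLSL})^{-1}\Phi^3(\mathcal{L})\mathcal{SLSL}.$$

Then $(\Sigma, \Lambda)$ is suitable, and $|h(\Sigma)|$ is odd, $|h(\Lambda)|$ even.

Proof. Each of $|\Sigma|$, $|\Lambda|$ is odd. Let

$$\hat{\ell} = h(\mathcal{LSLSLLSLLS})\ell, \qquad \hat{\mu} = \ell^R\,\overline{h(\mathcal{SL})}, \qquad \hat{p} = \overline{\ell}^R h(\mathcal{LSLSL}).$$

Then

$$\begin{aligned}
h(\Sigma) &= h((\mathcal{SLSL})^{-1}\Phi^3(\mathcal{S})\mathcal{SLSL}) \\
&= h((\mathcal{SLSL})^{-1}\mathcal{SLSLLSLSLLSLLSLSL}) \\
&= h(\mathcal{LSLSLLSLLSLSL}) \\
&= h(\mathcal{LSLSLLSLLS})\,\ell\ell^R\,\overline{h(\mathcal{SL})} \\
&= \hat{\ell}\hat{\mu}.
\end{aligned}$$

For a word $z \in \{\mathcal{S}, \mathcal{L}\}^*$ with $|z|$ even, we observe that $\overline{h(z^R)} = (h(z))^R$. Therefore, we also
have

$$\begin{aligned}
h(\Sigma) &= h(\mathcal{LSLSLLSLLSLSL}) \\
&= h(\mathcal{LS})\,h(\mathcal{L})\,\overline{h(\mathcal{SLLSLLSLSL})} \\
&= (\overline{h(\mathcal{SL})})^R\,\ell\ell^R\,(h(\mathcal{LSLSLLSLLS}))^R \\
&= \hat{\mu}^R\hat{\ell}^R.
\end{aligned}$$

Further,

$$\begin{aligned}
h(\Lambda) &= h((\mathcal{SLSL})^{-1}\Phi^3(\mathcal{L})\mathcal{SLSL}) \\
&= h((\mathcal{SLSL})^{-1}\mathcal{SLSLLSLSLLSLLSLSLLSLLSLSL}) \\
&= h(\mathcal{LSLSLLSLLSLSLLSLLSLSL}) \\
&= h(\mathcal{LSLSLLSLLS})\,h(\mathcal{L})\,\overline{h(\mathcal{SLLSLLSLSL})} \\
&= h(\mathcal{LSLSLLSLLS})\,\ell\ell^R\,(h(\mathcal{LSLSLLSLLS}))^R \\
&= \hat{\ell}\hat{\ell}^R.
\end{aligned}$$

Finally,

$$\begin{aligned}
h(\Lambda) &= h(\mathcal{LSLSLLSLLS})\,\ell\ell^R\,\overline{h(\mathcal{SLLSLLSLSL})} \\
&= h(\mathcal{LSLSLLSLLS})\,\ell\ell^R\,\overline{h(\mathcal{SL})}\;\overline{h(\mathcal{LS})}\;\overline{h(\mathcal{L})}\,h(\mathcal{LSLSL}) \\
&= h(\mathcal{LSLSLLSLLS})\,\ell\ell^R\,\overline{h(\mathcal{SL})}\;\overline{h(\mathcal{LS})}\;\overline{\ell}\,\overline{\ell}^R\,h(\mathcal{LSLSL}) \\
&= \hat{\ell}\hat{\mu}\,\overline{\hat{\mu}}^R\,\hat{p}.
\end{aligned}$$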

□ 
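Lemma 10 can be illustrated for the base pair $(\mathcal{S}, \mathcal{L}) = (S, L)$. The sketch below computes $\Sigma$ and $\Lambda$ and searches for witnesses of suitability; it assumes condition 2(c) reads $h(\mathcal{L}) = \ell\mu\overline{\mu}^Rp$, and the helper names (`conjugate`, `suitability_witness`) are ours.

```python
BLOCKS = {"S": "00100", "L": "00100100"}
PHI = {"S": "SL", "L": "SLL"}


def complement(w: str) -> str:
    return w.translate(str.maketrans("01", "10"))


def h(u: str) -> str:
    return "".join(
        BLOCKS[c] if i % 2 == 0 else complement(BLOCKS[c])
        for i, c in enumerate(u)
    )


def phi_power(w: str, k: int) -> str:
    for _ in range(k):
        w = "".join(PHI[c] for c in w)
    return w


def conjugate(w: str, prefix: str) -> str:
    """Return prefix^{-1} w prefix; w must start with prefix."""
    assert w.startswith(prefix)
    return w[len(prefix):] + prefix


def suitability_witness(s_word: str, l_word: str):
    """Return (ell, mu, p) witnessing suitability, or None if the check fails."""
    if len(s_word) % 2 == 0 or len(l_word) % 2 == 0:
        return None
    hs, hl = h(s_word), h(l_word)
    if len(hl) % 2 or hl != hl[::-1] or hs != hs[::-1]:
        return None                       # need h(L) = ell ell^R, h(S) palindrome
    ell = hl[:len(hl) // 2]
    if not hs.startswith(ell) or len(hs) == len(ell):
        return None                       # need h(S) = ell mu with mu non-empty
    mu = hs[len(ell):]
    rest = hl[len(hs):] if hl.startswith(hs) else None
    if rest is None or not rest.startswith(complement(mu)[::-1]):
        return None                       # need h(L) = ell mu complement(mu)^R p
    p = rest[len(mu):]
    return (ell, mu, p) if p else None


SIGMA = conjugate(phi_power("S", 3), "SLSL")
LAMBDA = conjugate(phi_power("L", 3), "SLSL")
base_witness = suitability_witness("S", "L")
next_witness = suitability_witness(SIGMA, LAMBDA)
```

For the base pair the function recovers the witnesses quoted in Section 3, and for $(\Sigma, \Lambda)$ it returns words whose lengths match $\hat{\ell}$, $\hat{\mu}$, $\hat{p}$ as defined in the proof.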


This result combines with Corollary 9 to allow us to parse words of $\mathcal{M}$. Let $L_0 = L$,
$S_0 = S$. Supposing that $(S_i, L_i)$ is suitable, let $\mathcal{L} = L_i$, $\mathcal{S} = S_i$, and

$$L_{i+1} = (S_iL_iS_iL_i)^{-1}\Phi^3(L_i)S_iL_iS_iL_i, \qquad S_{i+1} = (S_iL_iS_iL_i)^{-1}\Phi^3(S_i)S_iL_iS_iL_i.$$

Since $(S, L)$ is suitable, all of the pairs $(S_i, L_i)$ will be suitable by Lemma 10. Suppose
$y \in \{S, L\}^* \cap \mathcal{M}$. By repeatedly applying Corollary 9 we write $y = \pi v\sigma$ where $v \in \{S_i, L_i\}^*$.














8. Upper bound on growth rate 


Define

$$\mathcal{N} = \{z \in \{0,1\}^* : z \text{ avoids } xxx^R\}.$$

Theorem 11. The number of words in $\mathcal{N}$ of length $n$ is $O(n^{\lg n + c})$, some constant $c$.

To prove this theorem, it suffices to show that the number of words in $\mathcal{K}$ of length $n$ is
$O(n^{\lg n + c})$, some constant $c$.

From Theorem 1, it suffices to prove the following:

Theorem 12. The number of words in $\mathcal{M}$ of length $n$ is $O(n^{\lg n + c})$, some constant $c$.

Proof of Theorem 12. Let $y \in \mathcal{M}$ have length $n$. Choose $(\mathcal{S}, \mathcal{L}) = (S, L)$. Then iteration of
Corollary 9 gives

$$y = p_1\Phi^3(p_2\Phi^3(p_3 \cdots p_m\Phi^3(\epsilon)s_m \cdots s_3)s_2)s_1,$$

where $m \le (\lg n)/3$. For $i \in \{1, \ldots, m\}$ we have

$$s_i = \sigma_{3,i}\Phi(\mathcal{L}^{j_i})\sigma_{2,i}\mathcal{S}^{k_i}\sigma_{1,i}.$$

Since $|p_i\sigma_{3,i}\sigma_{2,i}\sigma_{1,i}| \le \kappa$, there is a constant $\alpha$ such that there are at most $\alpha$ choices for
$(p_i, \sigma_{3,i}, \sigma_{2,i}, \sigma_{1,i})$. This gives a number of choices for $\{(p_i, \sigma_{3,i}, \sigma_{2,i}, \sigma_{1,i})\}_{i=1}^{m}$ which is
polynomial in $n$.

This leaves the problem of bounding the number of choices of the $j_i$ and $k_i$.

We have

$$\begin{aligned}
n &\ge |\Phi^3(\Phi^3(\cdots\Phi^3(\epsilon)\Phi(\mathcal{L}^{j_m})\mathcal{S}^{k_m} \cdots \Phi(\mathcal{L}^{j_3})\mathcal{S}^{k_3})\Phi(\mathcal{L}^{j_2})\mathcal{S}^{k_2})\Phi(\mathcal{L}^{j_1})\mathcal{S}^{k_1}| \\
&= \sum_{i=1}^{m} \left(j_i|\Phi^{3i-2}(\mathcal{L})| + k_i|\Phi^{3i-3}(\mathcal{S})|\right) \\
&= \sum_{i=1}^{m} \left(j_iF_{6i-3} + k_iF_{6i-6}\right).
\end{aligned}$$

It follows that the number of choices for the $j_i$, $k_i$ is less than or equal to the number of
partitions (with repetition) of $n$ with parts chosen from $\{F_{3j}\}_{j=0}^{\infty}$. Since $F_{3i} \ge 2^i$, this is less
than or equal to the number of partitions of $n$ into powers of 2. Mahler [8] showed that the
number $p(n, r)$ of partitions of $n$ into powers of $r$ satisfies

lg p(n,r) ~ 

lg r 

thus, $p(n, 2) \sim Cn^{\lg n}$ where $C$ is constant. The result follows. □
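The quantity $p(n, r)$ itself is easy to compute exactly for small $n$ by a standard coin-change style recurrence. The sketch below is an illustration of the counting function only, not of Mahler's asymptotic estimate; the function name is ours.

```python
def count_power_partitions(n: int, r: int = 2) -> int:
    """Number of partitions of n into parts that are powers of r (r >= 2)."""
    ways = [1] + [0] * n          # ways[t] = partitions of t using parts so far
    part = 1
    while part <= n:
        # allow the part `part` to be used any number of times
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
        part *= r
    return ways[n]


binary_partitions = [count_power_partitions(n) for n in range(9)]
```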


9. Lower bound on growth 
Let $\psi : \{L, S\}^* \to \{L, S\}^*$ be given by

$$\psi(S) = LSL, \quad \psi(L) = LSLSL.$$

Since $\psi(S)$, $\psi(L)$ are palindromes, we have

$$\psi(u^R) = (\psi(u))^R, \quad u \in \{L, S\}^*.$$

Letting $(\mathcal{S}, \mathcal{L}) = (S, L)$, we find that $\psi = (\mathcal{LSL})^{-1}D^3\mathcal{LSL}$. It follows that $|\psi^k(S)| = F_{3k}$,
$|\psi^k(L)| = F_{3k+1}$.
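These properties of $\psi$ can be verified for small cases. The following sketch (our own helper names) tests the conjugacy in the form $LSL\,\psi(c) = D^3(c)\,LSL$ on each letter $c$, the lengths against $F_{3k}$ and $F_{3k+1}$ with $F_0 = F_1 = 1$, and the reversal identity on a sample word.

```python
PSI = {"S": "LSL", "L": "LSLSL"}
D = {"L": "LS", "S": "L"}       # the Fibonacci morphism


def apply(morphism: dict, w: str) -> str:
    return "".join(morphism[c] for c in w)


def power(morphism: dict, w: str, k: int) -> str:
    for _ in range(k):
        w = apply(morphism, w)
    return w


def fib(k: int) -> int:
    """Fibonacci numbers indexed so that F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a


conjugacy_holds = all(
    "LSL" + apply(PSI, c) == power(D, c, 3) + "LSL" for c in "SL"
)
lengths_match = all(
    len(power(PSI, "S", k)) == fib(3 * k)
    and len(power(PSI, "L", k)) == fib(3 * k + 1)
    for k in range(5)
)
sample = "LSSLLSL"
reversal_commutes = apply(PSI, sample[::-1]) == apply(PSI, sample)[::-1]
```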
Define languages $\mathscr{L}_i$ by

$$\mathscr{L}_0 = LS^*, \qquad \mathscr{L}_{i+1} = \psi(\mathscr{L}_i)LS^*.$$

Let $\mathscr{L} = \bigcup_{i=0}^{\infty} \mathscr{L}_i$.

A word $w \in \mathscr{L}_m$ has the form

$$w = \psi(\psi(\cdots\psi(\psi(LS^{k_m})LS^{k_{m-1}})\cdots LS^{k_2})LS^{k_1})LS^{k_0}$$

so that the number of words of $\mathscr{L}$ of length $n$ is the number of partitions of $n$ of the form

$$n = \sum_{i=0}^{m} \left(F_{3i+1} + k_iF_{3i}\right).$$

Since $F_r \le 2^r$, this is greater than or equal to the number of partitions of $n$ of the form

$$n = \sum_{i=0}^{m} \left(2^{3i+1} + k_i2^{3i}\right),$$

which is greater than or equal to the number of partitions of $n$ of the form

$$n = \sum_{i=0}^{m} (k_i + 1)2^{3i+1}.$$

This, in turn, is at least half of the number of partitions of $n$ of the form

$$n = \sum_{i=0}^{m} k_i2^{3i+1},$$

which is the number of partitions of $n/2$ of the form

$$n/2 = \sum_{i=0}^{m} k_i8^i.$$

Following Mahler [8], this is $p(n/2, 8) \sim Cn^{\lg n}/n^2$, where $C$ is constant. We will show
that no word of $h(\mathscr{L})$ has a non-empty factor $xxx^R$, so that this gives a lower bound on $\mathcal{N}$.
One checks the following:

Lemma 13. No word of $\mathscr{L}$ has any of the following factors:

$L^3$, $SSL$, $SLSLS$, $LSLSLLSLSLLSLSL = \psi(L^3)$, $LLSLLSLLSLSL$,

$LSLLSLSLLSLLSLSLLSL = \psi(SLSLS)$.

Theorem 14. No word of $h(\mathscr{L})$ contains a non-empty word of the form $xxx^R$.

Proof. Suppose $w \in \mathscr{L}$ and $xxx^R$ is a non-empty factor of $h(w)$. Let

$$W = \left((h(S) + h(L))(\overline{h(S)} + \overline{h(L)})\right)^* = ((00100 + 00100100)(11011 + 11011011))^*.$$

Thus $h(w)$ is a factor of a word of $W$. Note that none of 000, 111, 0101, 1010, 001011,
110011, 010010010, is a factor of any word of $W$, nor thus, of $h(w)$. Also, $\ell = 0010$ is always
followed by 01 in any word of $W$, while $\overline{\ell}^R = 1011$ is always preceded by 01.

If $|x| \le 2$, then $h(w)$ contains a factor 000, 111, 010110 or 101001. The last two contain
0101 and 1010, respectively, so this is impossible. Assume therefore that $|x| \ge 3$ and write $x = x'\alpha\beta\gamma$, where $\alpha$,
$\beta$, $\gamma \in \{0,1\}$. Then $\alpha\beta\gamma\gamma\beta\alpha$ is a factor of $xxx^R$. Suppose that $\gamma = 0$. (The other case
is similar.) Since 000 is not a factor of $h(w)$, we can assume that $\beta = 1$. Since 110011 is not
a factor of $h(w)$, $\alpha\beta\gamma = 010$. If $|x| = 3$, then $xxx^R$ is 010010010, which is not a factor of $h(w)$.
We conclude that $|x| \ge 4$. Since 1010 is not a factor of $h(w)$, $\ell = 0010$ is a suffix of $x$. Write
$x = x''\ell$, so that

$$xxx^R = x''\ell x''\ell\ell^R(x'')^R = x''\ell x''h(L)(x'')^R.$$

Since $x''\ell x''$ precedes $h(L)$ in a word of $W$, the length 4 suffix of $x''\ell x''$ must be 1011; since
$x''$ follows $\ell$ in $h(w)$, it follows that $x''$ begins with 0. Therefore, $|x''| \ge 5$. It follows that $x''$
must end with 11011, so that, in fact, $|x''| \ge 6$, and 011011 is a suffix of $x''$. If $|x''| = 6$, then

$$xxx^R = 0110110010\ 0110110010\ 0100110110 = 011011\,h(SSL)\,110110.$$

This forces $SSL$ to be a factor of $w$, which is impossible, since $w \in \mathscr{L}$. Thus $|x''| \ge 7$.

Since 0101 is not a factor of $h(w)$, if suffix 011011 of $x''$ is preceded by 1, it is preceded by
11, and $\overline{h(L)}\ell$ is a suffix of $x$. This forces $xx^R$ to have

$$\overline{h(L)}\,\ell\ell^R\,\overline{h(L)}^R = \overline{h(L)}\,h(L)\,\overline{h(L)}$$

as a factor, forcing $LLL$ to be a factor of $w$, which is impossible. We conclude that 0011011
is a suffix of $x''$. Since $x''$ follows $\ell$ in $h(w)$, 01 must be a prefix of $x''$. Suppose 011 is a prefix
of $x''$. Since 0011011 is a suffix, then $x''\ell x''$ has factor

$$0011011\,\ell\,011 = 00\,\overline{h(S)}\,h(S)\,11$$

and $w$ has a factor $SSuL$ for some $u$; this is impossible. We conclude that 010 is a prefix of
$x''$; since 0101 is not a factor of $h(w)$, in fact, $0100 = \ell^R$ is a prefix of $x''$. In total, writing
$x = \ell^R\hat{x}\ell$,

$$xxx^R = \ell^R\hat{x}\ell \cdot \ell^R\hat{x}\ell \cdot \ell^R\hat{x}^R\ell.$$

The ‘bracketing’ by $\ell$ and $\ell^R$ forces $w$ to contain a factor $uLuLu^R$, where $|u|$ is odd.
Consider the shortest factor $uLuLu^R$ of $w$, where $|u|$ is odd.

If the last letter of $u$ is $L$, then $LLL$ is a central factor of $uLu^R$. This is impossible. Thus
$S$ is a suffix of $u$. If $u = S$, then $uLuLu^R = SLSLS$, which is not a factor of any word of
$\mathscr{L}$. We conclude that $|u| > 1$, so that $|u| \ge 3$, since $|u|$ is odd.

Since $SSL$ is not a factor of $w$, the length 3 suffix of $uL$ is $LSL$. This makes $LSLSL$ a
central factor of $uLu^R$. Since $SLSLS$ is not a factor of $w$, the length 3 suffix of $u$ is $LLS$. If
$u = LLS$, then $Lu$ has prefix $LLL$, which is not a factor of $w$. We conclude that $|u| \ge 5$.

Since neither of $LLL$ and $SS$ is a factor of $w$, we conclude that $LSLLS$ is the length 5
suffix of $u$. If $u = LSLLS$, then $uLuLu^R = LSLLSLLSLLSLSLLSL$, with illegal factor
$LLSLLSLLSLSL$. Thus $|u| \ge 7$.

If the length 7 suffix of $u$ is $LSLSLLS$, then a central factor of $uLu^R$ is $LSLSLLS\,L\,SLLSLSL$,
which is not a factor of $w$. We conclude that the length 7 suffix is $SLLSLLS$.

Write $w = \psi(v)LS^k$ for some $v \in \mathscr{L}$, some $k \ge 0$. Since $|w|_L > 1$, $v \ne \epsilon$. Then $w$ has suffix
$LLS^k$, and prefix $uLuLSL$ of $uLuLu^R$ must be a factor of $\psi(v)$. Let $L(LSL)^mL$ be a factor
of $uLu$ where $m$ is as large as possible. Since $uLuLSL$ has suffix $LSLSL$, and $uLuLSL$ is a
factor of $\psi(v)$, word $L(LSL)^mLSLSL$ must be a factor of $uLuLSL$. If $m \ge 2$, then $uLuLu^R$
has illegal factor $LLSLLSLLSLSL$. We conclude that $m = 1$, so that $LLSLLSLL$ is not a
factor of $uLu$.

In the context of $uLu$, word $u$ follows the suffix $LLSLLSL$ of $uL$. Therefore, $u$ cannot
have $L$ as a prefix or $uLu$ contains the factor $LLSLLSLL$. It follows that $SL$ is a prefix of
$u$. However, a prefix of $u$ cannot be $SLS$; otherwise $uLu$ would have factor $uLSLS$ which
has illegal suffix $SLSLS$. It follows that the length 3 prefix of $u$ is $SLL$.

Write

$$u = SL : Lu'SL : LSL : LS$$

The colons indicate boundaries in $u$ between instances of $\psi(S)$ and $\psi(L)$. Thus, we may
write $u = SL\psi(u'')LS$, for some word $u''$ in $\mathscr{L}$. Since $|\psi(S)| \equiv |\psi(L)| \equiv 1$ (modulo 2), we
have

$$|u| \equiv |\psi(u'')| \equiv |u''| \pmod 2.$$

Then

$$\begin{aligned}
uLuLu^R &= SL\psi(u'')LSLSL\psi(u'')LSLSL(\psi(u''))^RLS \\
&= SL\psi(u''Lu''L(u'')^R)LS.
\end{aligned}$$

Recall that $w = \psi(v)LS^k$. Although the suffix $LS$ of $uLuLu^R$ may occur here as a prefix
of $LS^k$, certainly $uLuLu^R(LS)^{-1}$ is a factor of $\psi(v)$. We conclude that $u''Lu''L(u'')^R$ is a factor of
a word of $\mathscr{L}$, where $u''$ has odd length shorter than $u$. This is a contradiction. □
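Theorem 14 can be spot-checked on the shortest words of the language. The sketch below is illustrative only and assumes the definition $\mathscr{L}_0 = LS^*$, $\mathscr{L}_{i+1} = \psi(\mathscr{L}_i)LS^*$; it generates the words of the first few levels with small exponents and tests their images under $h$. The function names and the bounds `depth` and `max_k` are our own choices.

```python
BLOCKS = {"S": "00100", "L": "00100100"}
PSI = {"S": "LSL", "L": "LSLSL"}


def complement(w: str) -> str:
    return w.translate(str.maketrans("01", "10"))


def h(u: str) -> str:
    return "".join(
        BLOCKS[c] if i % 2 == 0 else complement(BLOCKS[c])
        for i, c in enumerate(u)
    )


def psi(w: str) -> str:
    return "".join(PSI[c] for c in w)


def contains_xxxR(w: str) -> bool:
    n = len(w)
    for k in range(1, n // 3 + 1):
        for i in range(n - 3 * k + 1):
            x = w[i:i + k]
            if w[i + k:i + 2 * k] == x and w[i + 2 * k:i + 3 * k] == x[::-1]:
                return True
    return False


def language_words(depth: int, max_k: int):
    """Words of level `depth` of the language, every exponent k_i in 0..max_k."""
    words = ["L" + "S" * k for k in range(max_k + 1)]
    for _ in range(depth):
        words = [
            psi(w) + "L" + "S" * k for w in words for k in range(max_k + 1)
        ]
    return words


sample_words = [w for depth in range(3) for w in language_words(depth, 2)]
all_images_avoid = all(not contains_xxxR(h(w)) for w in sample_words)
```

A finite check of this kind does not replace the proof; it only guards against transcription slips in the definitions of $h$, $\psi$ and $\mathscr{L}$.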